Abstract. We analyze a simple random process in which a token is moved in the interval
$A = \{0, \ldots, n\}$: Fix a probability distribution $\mu$ over $\{1, \ldots, n\}$. Initially, the token is
placed in a random position in $A$. In round $t$, a random value $d$ is chosen according to $\mu$.
If the token is in position $a \ge d$, then it is moved to position $a - d$. Otherwise it stays
put. Let $T$ be the number of rounds until the token reaches position 0. We show tight
bounds for the expectation of $T$ for the optimal distribution $\mu$. More precisely, we show
that $\min_\mu \{E_\mu(T)\} = \Theta((\log n)^2)$. For the proof, a novel potential function argument is
introduced. The research is motivated by the problem of approximating the minimum of
a continuous function over $[0, 1]$ with a "blind" optimization strategy.



1. Introduction 

For a positive integer $n$, assume a probability distribution $\mu$ on $X = \{1, \ldots, n\}$ is given.
Consider the following random process. A token moves in $A = \{0, \ldots, n\}$, as follows:

• Initially, place the token in some position in $A$.

• In round $t$: The token is at position $a \in A$. Choose an element $d$ from $X$ at random,
according to $\mu$. If $d \le a$, move the token to position $a - d$ (the step is "accepted"),
otherwise leave it where it is (the step is "rejected").

When the token has reached position 0, no further moves are possible, and we regard the
process as finished.

At the beginning the token is placed at a position chosen uniformly at random from
$\{1, \ldots, n\} = A - \{0\}$. (For simplicity of notation, we prefer this initial distribution over
the possibly more natural uniform distribution on $\{0, \ldots, n\}$. Of course, there is no real
difference between the two starting conditions.) Let $T$ be the number of rounds needed
until position 0 is reached. A basic performance parameter for the process is $E_\mu(T)$. As
$\mu$ varies, the value $E_\mu(T)$ will vary. The probability distribution $\mu$ may be regarded as a
strategy. We ask: How should $\mu$ be chosen so that $E_\mu(T)$ is as small as possible?

It is easy to exhibit distributions $\mu$ such that $E_\mu(T) = O((\log n)^2)$. (All asymptotic
notation in this paper refers to $n \to \infty$.) In particular, we will see that the "harmonic
distribution" given by

\[ \mu_{\mathrm{har}}(d) = \frac{1}{d \cdot H_n}, \quad \text{for } 1 \le d \le n, \tag{1.1} \]

where $H_n = \sum_{1 \le d \le n} \frac{1}{d}$ is the $n$th harmonic number, satisfies $E_{\mu_{\mathrm{har}}}(T) = O((\log n)^2)$. As
the main result of the paper, we will show that this upper bound is optimal up to constant
factors: $E_\mu(T) = \Omega((\log n)^2)$, for every distribution $\mu$. For the proof of this lower bound,
we introduce a novel potential function technique, which may be useful in other contexts.
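To make the process concrete, the following Python sketch simulates a single run and builds the harmonic distribution (1.1). The function names `harmonic_distribution` and `simulate_rounds` and the list representation of $\mu$ (index $d$ holds $\mu(d)$, index 0 unused) are illustrative assumptions, not part of the paper.

```python
import itertools
import random


def harmonic_distribution(n):
    """Return mu as a list with mu[d] = 1 / (d * H_n); mu[0] is unused."""
    h_n = sum(1.0 / d for d in range(1, n + 1))
    return [0.0] + [1.0 / (d * h_n) for d in range(1, n + 1)]


def simulate_rounds(mu, rng):
    """Run the token process once and return T, the number of rounds."""
    n = len(mu) - 1
    steps = range(1, n + 1)
    cum = list(itertools.accumulate(mu[1:]))  # cumulative weights for sampling
    a = rng.randint(1, n)                     # uniform start in {1, ..., n}
    rounds = 0
    while a > 0:
        d = rng.choices(steps, cum_weights=cum)[0]
        if d <= a:                            # accepted step
            a -= d
        rounds += 1                           # rejected steps cost a round too
    return rounds


if __name__ == "__main__":
    rng = random.Random(0)
    mu = harmonic_distribution(1024)
    runs = [simulate_rounds(mu, rng) for _ in range(200)]
    print("empirical mean of T:", sum(runs) / len(runs))
```

Averaging many runs gives only an empirical estimate of $E_\mu(T)$; it does not replace the bounds proved in the paper.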

1.1. Motivation and Background: Blind Optimization Strategies 

Consider the problem of minimizing a function $f\colon [0,1] \to \mathbb{R}$, in which the definition of
$f$ is unknown: the only information we can gain about $f$ is through trying sample points.
This is an instance of a black box optimization problem [1]. One algorithmic approach to
such problems is to start with an initial random point, and iteratively attempt to improve it
by making random perturbations. That is, if the current point is $x \in [0, 1]$, then we choose
some distance $d \in [0, 1]$ according to some probability distribution $\mu$ on $[0, 1]$, and move to
$x + d$ or $x - d$ if this is an improvement. The distribution $\mu$ may be regarded as a "search
strategy". Such a search is "blind" in the sense that it does not try to estimate how close
to the minimum it is and to adapt the distribution $\mu$ accordingly. The problem is how to
specify $\mu$. Of course, an optimal distribution $\mu$ depends on details of the function $f$.

The difficulty the search algorithm faces is that for general functions $f$ there is no
information about the scale of perturbations which are necessary to get close to the minimum.
This leads us to the idea that the distribution might be chosen so that it is scale invariant,
meaning that steps of all "orders of magnitude" occur with about the same probability.
Such a distribution is described in [2]. One starts by specifying a minimum perturbation
size $\varepsilon$. Then one chooses the probability density function $h(t) = 1/(pt)$ for $\varepsilon \le t \le 1$, and
$h(t) = 0$ otherwise, where $p = \ln(1/\varepsilon)$ is the precision of the algorithm. (A random number
distributed according to this density function may be generated by taking $d = \exp(-pu)$,
where $u$ is uniformly random in $[0, 1]$.)
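The sampling rule in the parenthesis is an inverse-transform step: for $\varepsilon \le t \le 1$ we have $\Pr(d \le t) = \Pr(u \ge -\ln(t)/p) = 1 + \ln(t)/p$, whose derivative is $1/(pt)$. A minimal Python sketch follows; the function name `sample_step` is a hypothetical choice for illustration.

```python
import math
import random


def sample_step(eps, rng):
    """Draw d with density h(t) = 1 / (p t) on [eps, 1], where p = ln(1/eps)."""
    p = math.log(1.0 / eps)
    u = rng.random()          # uniform in [0, 1)
    return math.exp(-p * u)   # d = exp(-p u) lies in (eps, 1]


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(sample_step(1e-3, rng), 5) for _ in range(5)])
```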

For general functions $f$, no analysis of this search strategy is known, but in
experiments on standard benchmark functions it (or higher dimensional variants) exhibits a good
performance. (For details see [4].) From here on, we focus on the simple case where $f$ is
unimodal, meaning that it is strictly decreasing in $[0, x_0]$ and strictly increasing in $[x_0, 1]$,
where $x_0$ is the unknown minimum point.

Remark 1.1. If one is given the information that $f$ is unimodal, one will use other,
deterministic search strategies, which approximate the optimum up to $\varepsilon$ within $O(\log(1/\varepsilon))$
steps. As early as 1953, in [3], "Fibonacci search" was proposed and analyzed, which for a
given tolerance $\varepsilon$ uses the optimal number of steps in a very strong sense.

The "blind search" strategy from [4] can be applied to more general functions $f$, but
the following analysis is valid only for unimodal functions. If the distance of the current
point $x$ from the optimum $x_0$ is $r \ge 2\varepsilon$ then every distance $d$ with $r/2 \le d \le r$ will lead to a
new point with distance at most $r/2$, provided the step goes in the right direction. Thus, the
probability of at least halving the distance to $x_0$ in one step is at least
$\frac{1}{2} \int_{r/2}^{r} \frac{dt}{pt} = \frac{\ln 2}{2p}$, which is independent of the current state $x$.
Obviously, then, the expected number of steps before the distance to $x_0$ has been halved is
at most $2p/\ln 2$. We regard the algorithm to be successful if the current point has distance smaller
than $2\varepsilon$ from $x_0$. To reach this goal, the initial distance has to be halved at most $\log(1/\varepsilon)$
times, leading to a bound of $O(\log(1/\varepsilon)^2)$ for the expected number of steps.

The question then arises whether this is the best that can be achieved. Is there perhaps
a choice for $\mu$ that works even better on unimodal functions? To investigate this question,
we consider a discrete version of the situation. The domain of $f$ is $A = \{0, \ldots, n\}$, and
$f$ is strictly increasing, so that $f$ takes its minimum at $x_0 = 0$. In this case, the search
process is very simple: the actual values of $f$ are irrelevant; going from $a$ to $a + d$ is never an
improvement. Actually, the search process is fully described by the simple random process
from Section 1. How long does it take to reach the optimal point 0, for a $\mu$ chosen as cleverly
as possible? For $\mu = \mu_{\mathrm{har}}$, we will show an upper bound of $O((\log n)^2)$, with an argument
very similar to the one leading to the bound $O(\log(1/\varepsilon)^2)$ in the continuous case. The
main result of this paper is that the bound for the discrete case is optimal.

1.2. Formalization as a Markov chain 

For the sake of simplicity, from now on we let $[a, b]$ denote the discrete interval $\{a, \ldots, b\}$
if $a$ and $b$ are integers. Given a probability distribution $\mu$ on $[1, n]$, the Markov chain
$R = (R_0, R_1, \ldots)$ is defined over the state space $A = [0, n]$ by the transition probabilities

\[ p_{a,a'} = \begin{cases} \mu(a - a') & \text{for } a' < a; \\ 1 - \sum_{1 \le d \le a} \mu(d) & \text{for } a' = a; \\ 0 & \text{for } a' > a. \end{cases} \]

Clearly, 0 is an absorbing state. We define the random variable $T = \min\{t \mid R_t = 0\}$. Let
us write $E_\mu(T)$ for the expectation of $T$ if $R_0$ is uniformly distributed in $A - \{0\} = [1, n]$.
We study $E_\mu(T)$ in dependence on $\mu$. In particular, we wish to identify distributions $\mu$ that
make $E_\mu(T)$ as small as possible (up to constant factors, where $n$ is growing).

Observation 1.2. If $\mu(1) = 0$ then $E_\mu(T) = \infty$.

This is because with probability $\frac{1}{n}$ position 1 is chosen as the starting point, and from
state 1, the process will never reach 0 if $\mu(1) = 0$. As a consequence, for the whole paper
we assume that all distributions $\mu$ that are considered satisfy

\[ \mu(1) > 0. \tag{1.2} \]

Next we note that it is not hard to derive a "closed expression" for $E_\mu(T)$. Fix $\mu$. For
$a \in A$, let $F(a) = \mu([1, a]) = \sum_{1 \le d \le a} \mu(d)$. We note recursion formulas for the expected
travel time $T_a = E_\mu(T \mid R_0 = a)$ when starting from position $a \in A$: $T_0 = 0$, and a one-step
analysis gives $F(a) \cdot T_a = 1 + \sum_{1 \le d < a} \mu(d) \cdot T_{a-d}$ for $a \ge 1$. It is not hard to
obtain (details are omitted due to space constraints)

\[ E_\mu(T) = \frac{1}{n} \sum_{1 \le a_1 < \cdots < a_\ell \le n} \frac{\mu(a_2 - a_1) \cdots \mu(a_\ell - a_{\ell-1})}{F(a_1) \cdots F(a_\ell)}, \tag{1.3} \]

where the sum ranges over all $2^n - 1$ nonempty subsets $\{a_1, \ldots, a_\ell\}$ of $[1, n]$ (for $\ell = 1$ the
numerator is the empty product 1). By definition
of $F(a)$, we see that $E_\mu(T)$ is a rational function of $(\mu(1), \ldots, \mu(n))$. By compactness, there
is some $\mu$ that minimizes $E_\mu(T)$. Unfortunately, there does not seem to be an obvious way
to use (1.3) to gain information about the way $E_\mu(T)$ depends on $\mu$ or what a distribution
$\mu$ that minimizes $E_\mu(T)$ looks like.
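For small $n$, the value $E_\mu(T)$ can be computed exactly without expanding a closed expression. The sketch below uses the one-step recurrence $F(a) \cdot T_a = 1 + \sum_{1 \le d < a} \mu(d) \cdot T_{a-d}$ with $T_0 = 0$, which is our reading of the recursion the paper alludes to; the function name `expected_rounds` and the use of exact fractions are illustrative assumptions.

```python
from fractions import Fraction


def expected_rounds(mu):
    """Exact E_mu(T) for a uniform start in [1, n].

    mu is a list of Fractions with mu[d] = mu(d) for 1 <= d <= n (mu[0] unused);
    mu[1] must be positive, as required by assumption (1.2).
    """
    n = len(mu) - 1
    F = Fraction(0)                    # running value of F(a) = mu([1, a])
    T = [Fraction(0)] * (n + 1)        # T[a] = expected rounds when starting at a
    for a in range(1, n + 1):
        F += mu[a]
        T[a] = (1 + sum(mu[d] * T[a - d] for d in range(1, a))) / F
    return sum(T[1:], Fraction(0)) / n


if __name__ == "__main__":
    uniform = [Fraction(0)] + [Fraction(1, 8)] * 8
    print(expected_rounds(uniform))
```

Exact rational arithmetic avoids rounding questions when two distributions are compared, at the price of quadratic running time in $n$.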

2. Upper bound 

In this section, we establish upper bounds on $E_\mu(T)$. We split the state space $A$ and
the set $X$ of possible distances into "orders of magnitude", arbitrarily choosing 2 as the
base.^1 Let $L = \lfloor \log n \rfloor$, and define $I_i = [2^i, 2^{i+1})$, for $0 \le i < L$, and $I_L = [2^L, n]$. Define

\[ p_i = \sum_{d \in I_i} \mu(d), \quad \text{for } 0 \le i \le L. \]

Clearly, then, $p_0 + p_1 + \cdots + p_L = 1$. To simplify notation, we do not exclude terms
$p_i$ with $i < 0$ or $i > L$. Such terms are always meant to have value 0. Consider the
process $R = (R_0, R_1, \ldots)$. Assume $t \ge 1$ and $i \ge 1$. If $R_{t-1} \ge 2^i$ then all numbers $d \in I_{i-1}$
will be accepted as steps and lead to a progress of at least $2^{i-1}$. Hence

\[ \Pr(R_t \le R_{t-1} - 2^{i-1} \mid R_{t-1} \ge 2^i) \ge p_{i-1}. \]

Further, if $R_{t-1} \in I_i$, we need to choose step sizes from $I_{i-1}$ at most twice to get below $2^i$.
Since the expected waiting time for the random distances to hit $I_{i-1}$ twice is $2/p_{i-1}$, the
expected time process $R$ remains in $I_i$ is not larger than $2/p_{i-1}$.

Adding up over $1 \le i \le j$, the expected time process $R$ spends in the interval $[2, a]$,
where $a \in I_j$ is the starting position, is not larger than

\[ \frac{2}{p_{j-1}} + \frac{2}{p_{j-2}} + \cdots + \frac{2}{p_1} + \frac{2}{p_0}. \]

After the process has left $I_1 = [2, 3]$, it has reached position 0 or position 1, and the expected
time before we hit 0 is not larger than $1/p_0 = 1/\mu(1)$. Thus, the expected number $T_a$ of
steps to get from $a \in I_j$ to 0 satisfies
$T_a \le \frac{2}{p_{j-1}} + \frac{2}{p_{j-2}} + \cdots + \frac{2}{p_1} + \frac{3}{p_0}$. This implies the
bound

\[ E_\mu(T) \le \frac{2}{p_{L-1}} + \frac{2}{p_{L-2}} + \cdots + \frac{2}{p_1} + \frac{3}{p_0}, \]

for arbitrary $\mu$. If we arrange that

\[ p_0 = \cdots = p_{L-1} = \frac{1}{L}, \tag{2.1} \]

we will have $T_a \le (2j + 1)L \le (2(\log a) + 1)(\log n) = O((\log a)(\log n)) = O((\log n)^2)$.
Clearly, then, $E_\mu(T) = O((\log n)^2)$ as well. The simplest distribution $\mu$ with (2.1) is the
one that distributes the weight evenly on the powers of 2 below $2^L$:

\[ \mu_{\mathrm{pow2}}(d) = \begin{cases} 1/L, & \text{if } d = 2^i,\ 0 \le i < L, \\ 0, & \text{otherwise.} \end{cases} \]

Thus, $E_{\mu_{\mathrm{pow2}}}(T) = O((\log n)^2)$. The "harmonic distribution" defined by (1.1) satisfies
$p_i \approx (\ln(2^{i+1}) - \ln(2^i))/H_n \approx \ln 2/\ln(n) = 1/\log_2 n$, and we also get $T_a = O((\log a)(\log n))$ and
$E_{\mu_{\mathrm{har}}}(T) = O((\log n)^2)$. More generally, all distributions $\mu$ with $p_0, \ldots, p_{L-1} \ge \alpha/L$, where
$\alpha > 0$ is constant, satisfy $E_\mu(T) = O((\log n)^2)$.

^1 log means "logarithm to the base 2" throughout.
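The bound $E_\mu(T) \le 2/p_{L-1} + \cdots + 2/p_1 + 3/p_0$ can be compared with the exact expectation for small $n$. The Python sketch below does this for the power-of-two distribution; all function names are illustrative, and the exact value is obtained from the one-step recurrence $F(a) \cdot T_a = 1 + \sum_{1 \le d < a} \mu(d) \cdot T_{a-d}$, which is an assumption of this sketch rather than a formula stated in the paper.

```python
from fractions import Fraction


def pow2_distribution(n):
    """mu[d] = 1/L on the powers of two below 2^L, L = floor(log2 n); needs n >= 2."""
    L = n.bit_length() - 1
    mu = [Fraction(0)] * (n + 1)
    for i in range(L):
        mu[2 ** i] = Fraction(1, L)
    return mu


def interval_masses(mu):
    """p[i] = mass of mu on I_i = [2^i, 2^(i+1)) intersected with [1, n]."""
    n = len(mu) - 1
    p = [Fraction(0)] * n.bit_length()       # indices 0 .. L
    for d in range(1, n + 1):
        p[d.bit_length() - 1] += mu[d]       # d.bit_length() - 1 = floor(log2 d)
    return p


def upper_bound(p):
    """2/p_{L-1} + ... + 2/p_1 + 3/p_0; requires p_0, ..., p_{L-1} > 0."""
    L = len(p) - 1
    return sum((2 / p[i] for i in range(1, L)), Fraction(0)) + 3 / p[0]


def expected_rounds(mu):
    """Exact E_mu(T) for a uniform start in [1, n], via the one-step recurrence."""
    n = len(mu) - 1
    F = Fraction(0)
    T = [Fraction(0)] * (n + 1)
    for a in range(1, n + 1):
        F += mu[a]
        T[a] = (1 + sum(mu[d] * T[a - d] for d in range(1, a))) / F
    return sum(T[1:], Fraction(0)) / n


if __name__ == "__main__":
    mu = pow2_distribution(100)
    print(float(expected_rounds(mu)), "<=", upper_bound(interval_masses(mu)))
```

For $n = 100$ we have $L = 6$, so the bound evaluates to $5 \cdot 12 + 18 = 78$; the exact expectation is smaller because the counting argument is deliberately generous.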

3. Lower bound 

We show, as the main result of this paper, that the upper bound of Section 2 is optimal
up to a constant factor.

Theorem 3.1. $E_\mu(T) = \Omega((\log n)^2)$ for all distributions $\mu$.

This theorem is proved in the remainder of this section. The distribution $\mu$ is fixed
from here on; we suppress $\mu$ in the notation. Recall that we may assume that $\mu(1) > 0$. We
continue to use the intervals $I_0, I_1, I_2, \ldots, I_L$ that partition $[1, n]$, as well as the probabilities
$p_i$, $0 \le i \le L$.

3.1. Intuition 

The basic idea for the lower bound is the following. For the majority of the starting
positions, the process has to traverse all intervals $I_{L-2}, I_{L-3}, \ldots, I_1, I_0$. Consider an interval
$I_i$. If the process reaches interval $I_{i+1}$, then afterwards steps of size $2^{i+2}$ and larger are
rejected, and so do not help at all for crossing $I_i$. Steps of size from $I_{i+1}, I_i, I_{i-1}, I_{i-2}$ may
be of significant help. Smaller step sizes will not help much. So, very roughly, the expected
time to traverse interval $I_i$ completely when starting in $I_{i+1}$ will be bounded from below by

\[ \frac{1}{p_{i+1} + p_i + p_{i-1} + p_{i-2}}, \]

since $1/(p_{i+1} + p_i + p_{i-1} + p_{i-2})$ is the waiting time for the first step with a "significant"
size to appear. If it were the case that there is a constant $\beta > 0$ with the property that for
each $0 \le i < L - 1$ the probability that interval $I_{i+1}$ is visited is at least $\beta$ then it would
not be hard to show that the expected travel time is bounded below by

\[ \sum_{1 \le j < L/2} \frac{\beta}{p_{2j+1} + p_{2j} + p_{2j-1} + p_{2j-2}}. \tag{3.1} \]

(We picked out only the even $i = 2j$ to avoid double counting.) Now the sum of the
denominators in the sum in (3.1) is at most 2, and the sum is minimal when all denominators
are equal, so the sum is bounded below by $\beta \cdot (L/2) \cdot (L/2)/2 = \beta \cdot L^2/8$, hence the expected
travel time would be $\Omega(L^2) = \Omega((\log n)^2)$.

It turns out that it is not straightforward to turn this informal argument into a
rigorous proof. First, there are (somewhat strange) distributions $\mu$ for which it is not the
case that each interval is visited with constant probability. (For example, let
$\mu(d) = B^{d-1} \cdot (B - 1)/(B^n - 1)$, for a large base $B$, such as a suitable power of $n$. Then the
"correct" jump directly to 0 has an overwhelming probability to be chosen first.^2) Even for
reasonable distributions $\mu$, it may happen that some intervals or even blocks of intervals are
jumped over with high probability. This means that the analysis of the cost of traversing $I_i$
has to take into account that this traversal might happen in one big jump starting from an
interval $I_j$ with $j$ much larger than $i$. Second, in a formal argument, the contribution of the
steps of size smaller than $2^{i-2}$ must be taken into account.

^2 The authors thank Uri Feige for pointing this out.

In the remainder of this section, we give a rigorous proof of the lower bound. For this, 
some machinery has to be developed. The crucial components are a reformulation of process 
R as another process, which as long as possible defers decisions about what the (randomly 
chosen) starting position is, and a potential function to measure how much progress the 
process has made in direction to its goal, namely reaching position 0. 

3.2. Reformulation of the process 

We change our point of view on the process $R$ (with initial distribution uniform in
$[1, n]$). The idea is that we do not have to fix the starting position right at the beginning, but
rather make partial decisions on what the starting position is as the process advances. The
information we hold on for step $t$ is a random variable $S_t$, with the following interpretation:
if $S_t \ge 1$ then $R_t$ is uniformly distributed in $[1, S_t]$; if $S_t = 0$ then $R_t = 0$.

What properties should the random process $S = (S_0, S_1, \ldots)$ on $[0, n]$ have to be a
proper model of the Markov chain $R$ from Section 1.2? We first give an intuitive description
of process $S$, and later formally define the corresponding Markov chain. Clearly, $S_0 = n$:
the starting position is uniformly distributed in $[1, n]$. Given $s = S_{t-1} \in [0, n]$, we choose a
step length $d$ from $X$, according to distribution $\mu$. Then there are two cases.

Case 1: $d > s$. If $s \ge 1$, this step cannot be used for any position in $[1, s]$, thus we
reject it and let $S_t = s$. If $s = 0$, no further move is possible at all, and we also reject.

Case 2: $d \le s$. Then $s \ge 1$, and the token is at some position in $[1, s]$. What
happens now depends on the position of the token relative to $d$, for which we only have a
probability distribution. We distinguish three subcases:

(i) The position of the token is larger than $d$. This happens with probability $(s - d)/s$.
In this case we "accept" the step, and now know that the token is in $[1, s - d]$,
uniformly distributed; thus, we let $S_t = s - d$.

(ii) The position of the token equals $d$. This happens with probability $1/s$. In this
case we "finish" the process, and let $S_t = 0$.

(iii) The position of the token is smaller than $d$. This happens with probability $(d - 1)/s$.
In this case we "reject" the step, and now know that the token is in $[1, d - 1]$,
uniformly distributed; thus, we let $S_t = d - 1$.

Clearly, once state 0 is reached, all further steps are rejected via Case 1.

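We formalize this idea by defining a new Markov chain $S = (S_0, S_1, \ldots)$, as follows. The
state space is $A = [0, n]$. For a state $s'$, we collect the total probability that we get from $s$
to $s'$. If $s' > s$, this probability is 0; if $s' = s$, this probability is $\sum_{s < d \le n} \mu(d) = 1 - F(s)$;
if $s' = 0 < s$, this probability is $\sum_{1 \le d \le s} \mu(d)/s = F(s)/s$; if $1 \le s' < s$, this probability is
$(\mu(s' + 1) + \mu(s - s')) \cdot s'/s$, since $d$ could be $s' + 1$ or $s - s'$. Thus, we have the following
transition probabilities:

\[ p_{s,s'} = \begin{cases} 0 & \text{if } s' > s; \\ 1 - F(s) & \text{if } s' = s; \\ F(s)/s & \text{if } s > s' = 0; \\ (\mu(s' + 1) + \mu(s - s')) \cdot s'/s & \text{if } s > s' \ge 1. \end{cases} \]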
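As a sanity check on these transition probabilities, the following Python sketch builds one row of the transition matrix of $S$ and lets us confirm that each row sums to 1. The function name `s_transition_row` and the exact-fraction representation are illustrative assumptions.

```python
from fractions import Fraction


def s_transition_row(mu, s):
    """Row s of the transition matrix of process S.

    mu is a list of Fractions, mu[d] = mu(d) for 1 <= d <= n (mu[0] unused).
    Entry t of the result is the probability of moving from state s to state t.
    """
    n = len(mu) - 1
    row = [Fraction(0)] * (n + 1)
    if s == 0:
        row[0] = Fraction(1)                  # state 0 is absorbing
        return row
    F_s = sum(mu[1:s + 1], Fraction(0))       # F(s) = mu([1, s])
    row[s] = 1 - F_s                          # d > s: step rejected
    row[0] = F_s / s                          # token sits exactly at position d
    for t in range(1, s):
        # d = t + 1 (rejected, token in [1, t]) or d = s - t (accepted)
        row[t] = (mu[t + 1] + mu[s - t]) * Fraction(t, s)
    return row


if __name__ == "__main__":
    mu = [Fraction(0)] + [Fraction(1, 6)] * 6
    for s in range(7):
        print(s, [str(q) for q in s_transition_row(mu, s)])
```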



Again, several initial distributions are possible for process $S$. The version with initial
distribution with $\Pr(S_0 = n) = 1$ is meant to describe process $R$. Define the stopping time

\[ T_S = \min\{t \mid S_t = 0\}. \]
We note that it is sufficient to analyze process $S$ (with the standard initial distribution).

Lemma 3.2. $E(T) = E(T_S)$.

Proof. For $0 \le s \le n$, consider the version $R^{(s)}$ of process $R$ induced by choosing the
uniform distribution on $[1, s]$ (for $s \ge 1$) or on $\{0\}$ (for $s = 0$) as the initial distribution.
We let

\[ A^{(s)} = E(\min\{t \mid R^{(s)}_t = 0\}). \]

Clearly, $A^{(n)} = E(T)$ and $A^{(0)} = 0$. We derive a recurrence for $(A^{(0)}, \ldots, A^{(n)})$. Let $s \ge 1$,
and assume the starting point $R_0$ is chosen uniformly at random from $[1, s]$. We carry out
the first step of $R^{(s)}$, which starts with choosing $d$. The following situations may arise.

(i) $d > s$. This happens with probability $1 - F(s) < 1$. This distance will be rejected
for all starting points in $[1, s]$, so the expected remaining travel time is $A^{(s)}$ again.

(ii) $1 \le d \le s$. For each $d$, the probability for this to happen is $\mu(d)$. For the starting
point $R_0$ there are three possibilities:

- $R_0 \in [1, d - 1]$ (only possible if $d > 1$). This happens with probability $(d - 1)/s$.
The remaining expected travel time is $A^{(d-1)}$.

- $R_0 = d$. This happens with probability $1/s$. The remaining travel time is 0.

- $R_0 \in [d + 1, s]$ (only possible if $d < s$). This happens with probability $(s - d)/s$.
The remaining expected travel time in this case is $A^{(s-d)}$.

We obtain:

\[ A^{(s)} = 1 + (1 - F(s)) A^{(s)} + \sum_{1 \le d \le s} \mu(d) \cdot \Big( \frac{d - 1}{s} \cdot A^{(d-1)} + \frac{s - d}{s} \cdot A^{(s-d)} \Big). \]

We rename $d - 1$ into $s'$ in the first sum and $s - d$ into $s'$ in the second sum and rearrange
to obtain

\[ A^{(s)} = \frac{1}{F(s)} \cdot \Big( 1 + \sum_{1 \le s' < s} (\mu(s' + 1) + \mu(s - s')) \cdot \frac{s'}{s} \cdot A^{(s')} \Big). \tag{3.2} \]

Next, we consider process $S$. For $0 \le s \le n$, let $S^{(s)}$ be the process obtained from $S$ by
choosing $s$ as the starting point. Clearly, $S^{(0)}$ always sits in 0, and $S^{(n)}$ is just $S$. Let

\[ B^{(s)} = E(\min\{t \mid S^{(s)}_t = 0\}), \]

the expected number of steps process $S$ needs to reach 0 when starting in $s$. Then $B^{(0)} = 0$
and $B^{(n)} = E(T_S)$. We derive a recurrence for $(B^{(0)}, \ldots, B^{(n)})$. Let $s \ge 1$. Carry out the
first step of $S^{(s)}$, which leads to state $s'$. The following situations may arise.

(i) $s = s' \ge 1$. This occurs with probability $1 - F(s)$, and the expected remaining
travel time is $B^{(s)}$ again.

(ii) $s' = 0$. In this case the expected remaining travel time is $B^{(0)} = 0$.

(iii) $s > s' \ge 1$. This occurs with probability $(\mu(s' + 1) + \mu(s - s')) \cdot s'/s$. The expected
remaining travel time is $B^{(s')}$.

Summing up, we obtain

\[ B^{(s)} = 1 + (1 - F(s)) B^{(s)} + \sum_{1 \le s' < s} (\mu(s' + 1) + \mu(s - s')) \cdot \frac{s'}{s} \cdot B^{(s')}. \]

Solving for $B^{(s)}$ yields:

\[ B^{(s)} = \frac{1}{F(s)} \cdot \Big( 1 + \sum_{1 \le s' < s} (\mu(s' + 1) + \mu(s - s')) \cdot \frac{s'}{s} \cdot B^{(s')} \Big). \tag{3.3} \]

Since $A^{(0)} = 0 = B^{(0)}$ and the recurrences (3.2) and (3.3) are identical, we have
$E(T) = A^{(n)} = B^{(n)} = E(T_S)$, as claimed. ■
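Lemma 3.2 can be checked numerically on small instances. The Python sketch below computes $E(T)$ from the per-position recurrence $F(a) \cdot T_a = 1 + \sum_{1 \le d < a} \mu(d) \cdot T_{a-d}$ (an assumption of this sketch, obtained by a one-step analysis of $R$) and $E(T_S)$ from recurrence (3.3), both in exact arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def expected_rounds_R(mu):
    """E(T): average over the start a in [1, n] of the travel times T_a of R."""
    n = len(mu) - 1
    F = Fraction(0)
    T = [Fraction(0)] * (n + 1)
    for a in range(1, n + 1):
        F += mu[a]
        T[a] = (1 + sum(mu[d] * T[a - d] for d in range(1, a))) / F
    return sum(T[1:], Fraction(0)) / n


def expected_rounds_S(mu):
    """E(T_S) = B^(n), computed with recurrence (3.3) for process S."""
    n = len(mu) - 1
    F = Fraction(0)
    B = [Fraction(0)] * (n + 1)
    for s in range(1, n + 1):
        F += mu[s]
        acc = sum(
            ((mu[t + 1] + mu[s - t]) * Fraction(t, s) * B[t] for t in range(1, s)),
            Fraction(0),
        )
        B[s] = (1 + acc) / F
    return B[n]


if __name__ == "__main__":
    h = sum(Fraction(1, d) for d in range(1, 13))
    mu = [Fraction(0)] + [Fraction(1, d) / h for d in range(1, 13)]
    print(expected_rounds_R(mu) == expected_rounds_S(mu))
```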



3.3. Potential function: Definition and application 

We introduce a potential function $\Phi$ on the state space $A = [0, n]$ to bound the progress
of process $S$. Our main lemma states that for any $s \in A$, for a random transition from
$S_i = s$ to $S_{i+1}$ the expected loss in potential is at most constant (i.e.,
$E(\Phi(S_i) - \Phi(S_{i+1}) \mid S_i = s) = O(1)$). This implies that $E(T_S) = \Omega(\Phi(S_0))$. Since the
potential function will satisfy $\Phi(S_0) = \Omega(\log^2 n)$, the lower bound follows.

We start by trying to give intuition for the definition. A rough approximation to the
potential function we use would be the following: For interval $I_i$ there is a term

\[ \psi_i = \frac{1}{\sum_{0 \le j \le L} p_j \cdot c^{|j-i|}} \tag{3.4} \]

for some constant $c$ with $\frac{1}{2} < c < 1$, e.g., $c = 1/\sqrt{2}$. For later use we note that

\[ \sum_{0 \le i \le L} \frac{1}{\psi_i} = \sum_{0 \le i \le L} \sum_{0 \le j \le L} p_j \cdot c^{|j-i|} = \sum_{0 \le j \le L} p_j \sum_{0 \le i \le L} c^{|j-i|} = O(1), \tag{3.5} \]

since $\sum_{0 \le j \le L} p_j = 1$ and $\sum_{k \ge 0} c^k = \frac{1}{1 - c}$. The term $\psi_i$ tries to give a rough lower bound
for the expected number of steps needed to cross $I_i$ in the following sense: The summands
$p_j \cdot c^{|j-i|}$ reflect the fact that step sizes that are close to $I_i$ will be very helpful for crossing
$I_i$, and step sizes far away from $I_i$ might help a little in crossing $I_i$, but they do so only to
a small extent ($j \ll i$) or with small probability ($j \gg i$). The idea is then to arrange that
a state $s \in I_k$ has potential about

\[ \Psi_k = \sum_{i \le k} \psi_i. \tag{3.6} \]

It turns out that analyzing process $S$ on the basis of a potential function that refers to the
intervals $I_i$ is possible but leads to messy calculations and numerous cases. The calculations
become cleaner if one avoids the use of the intervals in the definition and in applying the
potential function. The following definition derives from (3.4) and (3.6) by splitting up the
summands $\psi_i$ into contributions from all positions $a \in I_i$ and smoothing out the factors
$c^{|j-i|} = 2^{-|j-i|/2}$, for $a \in I_i$ and $d \in I_j$, into $2^{-|\log a - \log d|/2}$, which is $\sqrt{a/d}$ for $a \le d$ and
$\sqrt{d/a}$ for $d \le a$. This leads to the following. Assumption (1.2) guarantees that in the
formulas to follow all denominators are nonzero.

Definition 3.3. For $1 \le a \le n$ let

\[ \sigma_a = \sum_{1 \le d \le n} \mu(d) \cdot 2^{-|\log a - \log d|/2} = \sum_{1 \le d \le a} \mu(d) \sqrt{d/a} + \sum_{a < d \le n} \mu(d) \sqrt{a/d} \]

and $\varphi_a = 1/(a \sigma_a)$. For $0 \le s \le n$ define $\Phi(s) = \sum_{1 \le a \le s} \varphi_a$. The random variable $\Phi_t$,
$t = 0, 1, 2, \ldots$, is defined as $\Phi_t = \Phi(S_t)$.

(Whenever in the following we use letters $a$, $b$, $d$, the range $[1, n]$ is implicitly understood.)
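Definition 3.3 translates directly into a short computation. The Python sketch below evaluates $\sigma_a$, $\varphi_a$ and $\Phi(s)$ in floating point; the function name `potential` and the list layout are illustrative assumptions, and the code relies on $\mu(1) > 0$ so that every $\sigma_a$ is positive.

```python
import math


def potential(mu):
    """Return (phi, Phi) for a distribution mu given as a list (mu[0] unused).

    phi[a] = 1 / (a * sigma_a) and Phi[s] = phi[1] + ... + phi[s], where
    sigma_a = sum_d mu(d) * sqrt(min(a, d) / max(a, d)).
    """
    n = len(mu) - 1
    phi = [0.0] * (n + 1)
    Phi = [0.0] * (n + 1)
    for a in range(1, n + 1):
        sigma = sum(
            mu[d] * math.sqrt(min(a, d) / max(a, d)) for d in range(1, n + 1)
        )
        phi[a] = 1.0 / (a * sigma)
        Phi[a] = Phi[a - 1] + phi[a]
    return phi, Phi


if __name__ == "__main__":
    # Point mass on step size 1: sigma_a = 1/sqrt(a), hence phi_a = 1/sqrt(a).
    phi, Phi = potential([0.0, 1.0, 0.0, 0.0, 0.0])
    print(phi[1:], Phi[4])
```

The factor $\sqrt{\min(a, d)/\max(a, d)}$ is just $2^{-|\log a - \log d|/2}$ written without logarithms.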

We note some easy observations and one fundamental fact about $\Phi_t$, $t \ge 0$.

Lemma 3.4.

(a) $\Phi_t$, $t \ge 0$, is nonincreasing for $t$ increasing.

(b) $\Phi_t = 0 \Leftrightarrow S_t = 0$.

(c) $\Phi_0 = \Omega((\log n)^2)$ ($\Phi_0$ is a number that depends on $n$ and $\mu$).

Proof. (a) is clear since $S_t$, $t \ge 0$, is nonincreasing and the terms $\varphi_a$ are positive. (b) is
obvious since $\Phi_t = 0$ if and only if $\Phi(S_t)$ is the empty sum, which is the case if and only
if $S_t = 0$. We prove (c). In this proof we use the intervals $I_i$ and the probabilities $p_i$,
$0 \le i \le L$, from Section 2. We use the notation $i(a) = \lfloor \log a \rfloor = \max\{i \mid 2^i \le a\}$. We
start with finding an upper bound for $\sigma_a$ by grouping the summands in $\sigma_a$ according to the
intervals. Let $c = 1/\sqrt{2}$.



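\[ \sigma_a = \sum_{1 \le d \le n} \mu(d) \cdot 2^{-|\log a - \log d|/2} \le \sum_{j \le i(a)} \sum_{d \in I_j} \mu(d) \cdot 2^{-(i(a) - j - 1)/2} + \sum_{j > i(a)} \sum_{d \in I_j} \mu(d) \cdot 2^{-(j - i(a) - 1)/2} \]
\[ = 2c \cdot \Big( \sum_{j \le i(a)} p_j \cdot c^{i(a) - j} + \sum_{j > i(a)} p_j \cdot c^{j - i(a)} \Big) = 2c \cdot \sum_{0 \le j \le L} p_j \cdot c^{|j - i(a)|}. \]

(Here we used $2^{1/2} = 1/c = 2c$.) Hence, for $0 \le i < L$,

\[ \sum_{a \in I_i} \varphi_a = \sum_{a \in I_i} \frac{1}{a \sigma_a} \ge 2^i \cdot \frac{1}{2^{i+1}} \cdot \frac{\psi_i}{2c} = \frac{\psi_i}{4c}, \]

with $\psi_i$ from (3.4). Thus,

\[ \Phi_0 \ge \sum_{0 \le i < L} \frac{\psi_i}{4c}. \tag{3.7} \]

Let $u_i = 4c/\psi_i$ be the reciprocal of the summand for $i$ in (3.7), $0 \le i < L$. From (3.5) we
read off that $\sum_{0 \le i < L} u_i \le k$, for some constant $k$. Now $\sum_{0 \le i < L} \frac{1}{u_i}$ with $\sum_{0 \le i < L} u_i \le k$ is
minimal if all $u_i$ are equal to $k/L$. Together with (3.7) this entails $\Phi_0 \ge L \cdot (L/k) = L^2/k = \Omega((\log n)^2)$,
which proves part (c) of Lemma 3.4. ■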



The crucial step in the lower bound proof is to show that the progress made by process
$S$ in one step, measured in terms of the potential, is bounded:

Lemma 3.5 (Main Lemma). There is a constant $C$ such that for $0 \le s \le n$, we have
$E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) \le C$.

The proof of Lemma 3.5 is the core of the analysis. It will be given in Section 3.4. To
prove Theorem 3.1 we need the following lemma, which is stated and proved (as Lemma
12) in [2]. (It is a one-sided variant of Wald's identity.)

Lemma 3.6. Let $X_1, X_2, \ldots$ denote random variables with bounded range, let $g > 0$ and
let $T = \min\{t \mid X_1 + \cdots + X_t \ge g\}$. If $E(T) < \infty$ and $E(X_t \mid T \ge t) \le C$ for all $t \in \mathbb{N}$,
then $E(T) \ge g/C$.
Proof of Theorem 3.1. Since $S_t = 0$ if and only if $\Phi_t = 0$ (Lemma 3.4(b)), the stopping time
$T_\Phi = \min\{t \mid \Phi_t = 0\}$ of the potential reaching 0 satisfies $T_\Phi = T_S$. Thus, to prove Theorem 3.1
it is sufficient to show that $E(T_\Phi) = \Omega((\log n)^2)$. For this, we let $X_t = \Phi_{t-1} - \Phi_t$, the
progress made in step $t$ in terms of the potential. By Lemma 3.5, $E(X_t \mid S_{t-1} = s) \le C$,
for all $s \ge 1$, and hence

\[ E(X_t \mid T_\Phi \ge t) = E(X_t \mid \Phi(S_{t-1}) > 0) \le C. \]

Observe that $X_1 + \cdots + X_t = \Phi_0 - \Phi_t$ and hence $T_\Phi = \min\{t \mid X_1 + \cdots + X_t \ge \Phi_0\}$. Applying
Lemma 3.6 and combining with Lemma 3.4(c), we get that $E(T_\Phi) \ge \Phi_0/C = \Omega((\log n)^2)$,
which proves Theorem 3.1. ■

The only missing part to fill in is the proof of Lemma 3.5.

3.4. Proof of the Main Lemma (Lemma 3.5)

Fix $s \in [1, n]$, and assume $S_{t-1} = s$. Our aim is to show that the "expected potential
loss" is constant, i.e., that

\[ E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) = O(1). \]

Clearly, $E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) = \sum_{0 \le x \le s} \Delta(s, x)$, where

\[ \Delta(s, x) = (\Phi(s) - \Phi(x)) \cdot \Pr(S_t = x \mid S_{t-1} = s). \tag{3.8} \]

We show that $\sum_{0 \le x \le s} \Delta(s, x)$ is bounded by a constant, by considering $\Delta(s, s)$, $\Delta(s, 0)$,
and $\sum_{1 \le x < s} \Delta(s, x)$ separately.

For $x = s$, the potential difference $\Phi(s) - \Phi(x)$ is 0, and thus

\[ \Delta(s, s) = 0. \tag{3.9} \]



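Bounding $\Delta(s, 0)$: According to the definition of the process $S$, a step from $S_{t-1} = s$ to
$S_t = 0$ has probability $F(s)/s$. Since $\Phi(0) = 0$, the potential difference is $\Phi(s)$. Thus, we
obtain

\[ \Delta(s, 0) = \frac{F(s)}{s} \cdot \Phi(s) = \frac{1}{s} \Big( \sum_{d \le s} \mu(d) \Big) \Big( \sum_{a \le s} \varphi_a \Big) = \frac{1}{s} \sum_{a \le s} \frac{\sum_{b \le s} \mu(b)}{\sum_{b \le a} \mu(b) \sqrt{ab} + \sum_{a < b \le n} \mu(b) a^{3/2}/\sqrt{b}} \le \frac{1}{s} \sum_{a \le s} \delta(a), \]

where

\[ \delta(a) = \frac{\sum_{b \le s} \mu(b)}{\sum_{b \le a} \mu(b) \sqrt{ab} + \sum_{a < b \le s} \mu(b) a^{3/2}/\sqrt{b}}. \]

We bound $\delta(a)$. For $b \le a$ and $\mu(b) \ne 0$, the quotient of the summands in the numerator
and denominator of $\delta(a)$ that correspond to $b$ is $1/\sqrt{ab} \le \sqrt{a}/a \le \sqrt{s}/a$. For $a < b$ and
$\mu(b) \ne 0$, the quotient is $\sqrt{b}/a^{3/2} \le \sqrt{s}/a$. Thus, $\delta(a) \le \sqrt{s}/a$. This implies (recall that
$H_s = \sum_{1 \le d \le s} \frac{1}{d}$)

\[ \Delta(s, 0) \le \frac{1}{s} \cdot \sum_{a \le s} \frac{\sqrt{s}}{a} = \frac{H_s}{\sqrt{s}} \le \frac{\ln(s) + 1}{\sqrt{s}} < 2. \tag{3.10} \]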
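Bounding $\sum_{1 \le x < s} \Delta(s, x)$: Assume $1 \le x < s$. According to the definition of the process
$S$,

\[ \Pr(S_t = x \mid S_{t-1} = s) = \frac{x}{s} \cdot (\mu(x + 1) + \mu(s - x)). \]

The potential difference is $\Phi(s) - \Phi(x) = \sum_{x < a \le s} \varphi_a$. Thus we have

\[ \sum_{1 \le x < s} \Delta(s, x) = \sum_{1 \le x < s} \frac{x}{s} \cdot (\mu(x + 1) + \mu(s - x)) \sum_{x < a \le s} \varphi_a = \frac{1}{s} \sum_{1 < a \le s} (\lambda_a + \gamma_a), \tag{3.11} \]

where $\lambda_a = \varphi_a \cdot \sum_{1 \le x < a} \mu(x + 1) \cdot x$ and $\gamma_a = \varphi_a \cdot \sum_{1 \le x < a} \mu(s - x) \cdot x$. We bound $\lambda_a$ and $\gamma_a$
separately. Observe first that

\[ \lambda_a = \varphi_a \cdot \sum_{2 \le x \le a} \mu(x)(x - 1) = \frac{\sum_{1 \le x \le a} \mu(x)(x - 1)}{\sum_{1 \le b \le a} \mu(b) \sqrt{ab} + \sum_{a < b \le n} \mu(b) a^{3/2}/\sqrt{b}} \le \frac{\sum_{1 \le b \le a} \mu(b)(b - 1)}{\sum_{1 \le b \le a} \mu(b) \sqrt{ab}}. \tag{3.12} \]

(We used the definition of $\varphi_a$, and omitted some summands in the denominator.) Recall
that $\mu(1) > 0$, so the denominator is not zero. For each $b \le a$ we clearly have
$\mu(b)(b - 1) \le \mu(b) \sqrt{ab}$, thus the sum in the numerator in (3.12) is not larger than the sum in the
denominator, and we get $\lambda_a \le 1$.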
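Next, we bound $\gamma_a$ for $a \le s$:

\[ \gamma_a = \varphi_a \cdot \sum_{1 \le x < a} \mu(s - x) \cdot x = \varphi_a \cdot \sum_{s - a < x < s} \mu(x)(s - x) = \frac{\sum_{s - a < x \le a} \mu(x)(s - x) + \sum_{\max\{a, s - a\} < x < s} \mu(x)(s - x)}{\sum_{1 \le b \le a} \mu(b) \sqrt{ab} + \sum_{a < b \le n} \mu(b) a^{3/2}/\sqrt{b}}. \]

The denominator is not zero because $\mu(1) > 0$. Hence, if $\mu(x) = 0$ for all $s - a < x < s$,
then $\gamma_a = 0$. Otherwise, by omitting some of the summands in the denominator we obtain

\[ \gamma_a \le \frac{\sum_{s - a < b \le a} \mu(b)(s - b) + \sum_{\max\{a, s - a\} < b < s} \mu(b)(s - b)}{\sum_{s - a < b \le a} \mu(b) \sqrt{ab} + \sum_{\max\{a, s - a\} < b < s} \mu(b) a^{3/2}/\sqrt{b}}. \]

(If $a \le s/2$, the first sum in both numerator and denominator is empty.) Now consider
the quotient of the summands for each $b$ with $\mu(b) > 0$. For $s - a < b \le a$, this quotient is

\[ \frac{\mu(b)(s - b)}{\mu(b) \sqrt{ab}} \le \frac{a}{\sqrt{a \cdot (s - a + 1)}} = \sqrt{\frac{a}{s - a + 1}} \le \sqrt{\frac{s}{s - a + 1}}. \]

For $\max\{a, s - a\} < b < s$, the quotient of the corresponding summands is

\[ \frac{\mu(b)(s - b)}{\mu(b) a^{3/2}/\sqrt{b}} \le \frac{\min\{a, s - a\} \cdot \sqrt{b}}{a^{3/2}} \le \frac{a \cdot \sqrt{s}}{a^{3/2}} = \sqrt{\frac{s}{a}}. \]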
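Hence, $\gamma_a \le \sqrt{s/(s - a + 1)} + \sqrt{s/a}$. Plugging this bound on $\gamma_a$ and the bound $\lambda_a \le 1$ into
(3.11) we obtain

\[ \sum_{1 \le x < s} \Delta(s, x) \le \frac{1}{s} \sum_{1 < a \le s} \Big( 1 + \sqrt{\frac{s}{s - a + 1}} + \sqrt{\frac{s}{a}} \Big) \le 1 + \frac{2}{\sqrt{s}} \sum_{1 \le a \le s} \frac{1}{\sqrt{a}} \le 1 + \frac{2}{\sqrt{s}} \cdot 2\sqrt{s} = 5. \tag{3.13} \]

Summing up the bounds from (3.9), (3.10), and (3.13), we obtain

\[ E(\Phi_{t-1} - \Phi_t \mid S_{t-1} = s) \le \Delta(s, 0) + \sum_{1 \le x < s} \Delta(s, x) + \Delta(s, s) \le 2 + 5 + 0 = 7. \]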
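The constant in the Main Lemma can be observed numerically. The following Python sketch computes, for every state $s$, the expected potential loss $\sum_{0 \le x \le s} \Delta(s, x)$ from the transition probabilities of $S$ and the potential of Definition 3.3, and reports the maximum. The function names are hypothetical, floating-point arithmetic is used, and $\mu(1) > 0$ is assumed.

```python
import math


def harmonic_distribution(n):
    """mu[d] = 1 / (d * H_n) for 1 <= d <= n; mu[0] is unused."""
    h_n = sum(1.0 / d for d in range(1, n + 1))
    return [0.0] + [1.0 / (d * h_n) for d in range(1, n + 1)]


def potential(mu):
    """Phi[s] = sum over a <= s of 1 / (a * sigma_a), as in Definition 3.3."""
    n = len(mu) - 1
    Phi = [0.0] * (n + 1)
    for a in range(1, n + 1):
        sigma = sum(
            mu[d] * math.sqrt(min(a, d) / max(a, d)) for d in range(1, n + 1)
        )
        Phi[a] = Phi[a - 1] + 1.0 / (a * sigma)
    return Phi


def max_expected_loss(mu):
    """Largest expected one-step potential loss over all states s in [1, n]."""
    n = len(mu) - 1
    Phi = potential(mu)
    worst = 0.0
    F = 0.0                                    # running value of F(s)
    for s in range(1, n + 1):
        F += mu[s]
        loss = (F / s) * Phi[s]                # transition s -> 0
        for x in range(1, s):                  # transitions s -> x, 1 <= x < s
            prob = (mu[x + 1] + mu[s - x]) * x / s
            loss += (Phi[s] - Phi[x]) * prob
        worst = max(worst, loss)               # staying in s loses nothing
    return worst


if __name__ == "__main__":
    print(max_expected_loss(harmonic_distribution(64)))
```

The proof guarantees a value of at most 7 for every admissible $\mu$; the sketch only illustrates this for the distributions one feeds in.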




4. Open problems 

1. We conjecture that the method can be adapted to the continuous case to prove a
lower bound of $\Omega((\log(1/\varepsilon))^2)$ for approximating the minimum of some unimodal function
$f$ by a scale-invariant search strategy (see Section 1.1).

2. It is an open problem whether our method can be used to prove a lower bound of
$\Omega((\log n)^2)$ for finding the minimum of an arbitrary unimodal function $f\colon \{0, \ldots, n\} \to \mathbb{R}$,
by a scale-invariant search strategy.